Dense prediction tasks such as segmentation and detection of pathological entities hold crucial clinical value in the digital pathology workflow. However, obtaining dense annotations on large cohorts is usually tedious and expensive. Contrastive learning (CL) is thus often employed to leverage large volumes of unlabeled data to pre-train the backbone network. To boost CL for dense prediction, some studies have proposed variations of dense matching objectives in pre-training. However, our analysis shows that employing existing dense matching strategies on histopathology images enforces invariance among incorrect pairs of dense features and, thus, is imprecise. To address this, we propose a precise location-based matching mechanism that utilizes the overlapping information between geometric transformations to precisely match regions in two augmentations. Extensive experiments on two pretraining datasets (TCGA-BRCA, NCT-CRC-HE) and three downstream datasets (GlaS, CRAG, BCSS) highlight the superiority of our method in semantic and instance segmentation tasks. Our method outperforms previous dense matching methods by up to 7.2 % in average precision for detection and 5.6 % in average precision for instance segmentation tasks. Additionally, by using our matching mechanism in the three popular contrastive learning frameworks, MoCo-v2, VICRegL and ConCL, the average precision in detection is improved by 0.7 % to 5.2 % and the average precision in segmentation is improved by 0.7 % to 4.0 %, demonstrating its generalizability.
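As a rough illustration of the location-based matching idea described in the abstract above, the sketch below pairs grid cells of two augmented crops by checking where each cell center of one view falls inside the other view. The box format, the `grid` parameter, and the function names `cell_centers` and `match_cells` are assumptions made for this example, not the authors' implementation. Flips and other geometric transformations are omitted for brevity.

```python
from typing import List, Tuple

# A crop is described by its box in source-image pixels: (x0, y0, x1, y1).
Box = Tuple[float, float, float, float]
Cell = Tuple[int, int]  # (row, col) index into a grid x grid feature map


def cell_centers(box: Box, grid: int) -> List[Tuple[int, int, float, float]]:
    """Return (row, col, cx, cy) for every cell of a grid x grid map laid over box."""
    x0, y0, x1, y1 = box
    w, h = (x1 - x0) / grid, (y1 - y0) / grid
    return [(r, c, x0 + (c + 0.5) * w, y0 + (r + 0.5) * h)
            for r in range(grid) for c in range(grid)]


def match_cells(box_a: Box, box_b: Box, grid: int) -> List[Tuple[Cell, Cell]]:
    """Pair each cell of view A with the cell of view B covering the same location."""
    bx0, by0, bx1, by1 = box_b
    bw, bh = (bx1 - bx0) / grid, (by1 - by0) / grid
    pairs = []
    for r, c, cx, cy in cell_centers(box_a, grid):
        # Only centers that land inside view B belong to the overlap region.
        if bx0 <= cx < bx1 and by0 <= cy < by1:
            rb = min(int((cy - by0) / bh), grid - 1)
            cb = min(int((cx - bx0) / bw), grid - 1)
            pairs.append(((r, c), (rb, cb)))
    return pairs


# Two 100 x 100 crops that overlap in their lower-right / upper-left quarter.
pairs = match_cells((0, 0, 100, 100), (50, 50, 150, 150), grid=4)
for cell_a, cell_b in pairs:
    print(cell_a, "<->", cell_b)
```

Only the matched pairs would enter a dense contrastive objective in this sketch; cells outside the overlap are left unpaired rather than being matched to a nearest neighbor.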
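We introduce a new approach to image forensics: placing physical refractive objects, which we call totems, into a scene to protect any photograph taken of that scene. Totems bend and redirect light rays, thus providing multiple, albeit distorted, views of the scene within a single image. A defender can use these distorted totem pixels to detect whether an image has been manipulated. Our approach unscrambles the light rays passing through the totems by estimating their positions in the scene and using their known geometric and material properties. To verify a totem-protected image, we detect inconsistencies between the scene reconstructed from the totem viewpoints and the scene's appearance from the camera viewpoint. Such an approach makes the adversarial manipulation task more difficult, because the adversary must modify both the totem pixels and the image pixels in a geometrically consistent manner without knowing the physical properties of the totems. Unlike prior learning-based approaches, our method does not require training on datasets of specific manipulations, and instead uses the physical properties of the scene and camera to solve the forensics problem.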
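Automated detection of anomalous trajectories is an important problem with many applications in intelligent transportation systems. Many existing studies have focused on distinguishing anomalous trajectories from normal ones, ignoring the large differences among anomalous trajectories. A recent study made great progress in identifying anomalous trajectory patterns and proposed a two-stage algorithm for anomalous trajectory detection and classification (ATDC). This algorithm performs very well but has some limitations, such as high time complexity and poor interpretability. Here, we present a careful theoretical and empirical analysis of the ATDC algorithm, showing that the calculation of the anomaly scores in both stages can be simplified and that the second stage of the algorithm is much more important than the first. Hence, we develop the FastATDC algorithm, which introduces a random sampling strategy in both stages. Experimental results show that FastATDC is 10 to 20 times faster than ATDC on real datasets. Moreover, FastATDC outperforms the baseline algorithms and is comparable to the ATDC algorithm.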
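Histopathology whole slide images (WSIs) play a very important role in clinical studies and serve as the gold standard for many cancer diagnoses. However, building automatic tools for processing WSIs is challenging because of their enormous size. Currently, to deal with this issue, conventional methods rely on a multiple instance learning (MIL) strategy to process a WSI at the patch level. Although effective, such methods are computationally expensive, because tiling a WSI into patches takes time, and they do not explore the spatial relations between these tiles. To tackle these limitations, we propose a locally supervised learning framework that processes the entire slide by exploring all of the local and global information it contains. The framework divides a pre-trained network into several modules and optimizes each module locally using an auxiliary model. We also introduce a random feature reconstruction (RFR) unit to preserve discriminative features during training, which improves the performance of our method by 1% to 3%. Extensive experiments on three publicly available WSI datasets (TCGA-NSCLC, TCGA-RCC, and LKS) highlight the superiority of our method on different classification tasks. Our method outperforms state-of-the-art MIL methods in accuracy while being 7 to 10 times faster. Additionally, when divided into eight modules, our method requires as little as 20% of the total GPU memory needed for end-to-end training. Our code is available at https://github.com/cvlab-stonybrook/local_learning_wsi.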
Convolutional neural networks have been widely deployed in various application scenarios. To extend their applications to accuracy-critical domains, researchers have been investigating approaches that boost accuracy through either deeper or wider network structures, which bring an exponential increase in computational and storage cost and delay the response time. In this paper, we propose a general training framework named self distillation, which notably enhances the performance (accuracy) of convolutional neural networks by shrinking the size of the network rather than enlarging it. Unlike traditional knowledge distillation, a knowledge transformation methodology among networks that forces student neural networks to approximate the softmax layer outputs of pre-trained teacher neural networks, the proposed self distillation framework distills knowledge within the network itself. The networks are first divided into several sections. Then the knowledge in the deeper portion of the networks is squeezed into the shallow ones. Experiments further demonstrate the generality of the proposed self distillation framework: the average accuracy improvement is 2.65%, ranging from a minimum of 0.61% on ResNeXt to a maximum of 4.07% on VGG19. In addition, it provides the flexibility of depth-wise scalable inference on resource-limited edge devices. Our code will be released on GitHub soon.
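The idea of squeezing knowledge from the deeper portion of a network into the shallow sections can be sketched as a per-section loss. The example below is a minimal sketch under stated assumptions: each section is assumed to have its own classifier head, the deepest head acts as the teacher, and the weight `alpha`, the softmax `temperature`, and all function names are hypothetical choices for illustration rather than the settings or code of the paper.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs: List[float], label: int) -> float:
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_divergence(p: List[float], q: List[float]) -> float:
    """KL(p || q) for two discrete distributions with positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def self_distillation_loss(section_logits: List[List[float]], label: int,
                           alpha: float = 0.3, temperature: float = 3.0) -> float:
    """Combine label supervision with distillation from the deepest section.

    section_logits is ordered from the shallowest to the deepest classifier head.
    alpha and temperature are hypothetical values chosen for this sketch.
    """
    teacher = softmax(section_logits[-1], temperature)
    # The deepest head is trained with the labels only.
    loss = cross_entropy(softmax(section_logits[-1]), label)
    for logits in section_logits[:-1]:
        hard = cross_entropy(softmax(logits), label)                 # label term
        soft = kl_divergence(teacher, softmax(logits, temperature))  # teacher term
        loss += (1 - alpha) * hard + alpha * soft
    return loss


# Three sections, three classes, true class 0.
example = [[0.5, 0.2, 0.1], [1.0, 0.3, -0.2], [2.0, 0.5, -1.0]]
print(self_distillation_loss(example, label=0))
```

In a real training loop the teacher distribution would typically be treated as a constant (no gradient), so that only the shallow heads are pulled toward the deepest one.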
Digital engineering transformation is a crucial process for the engineering paradigm shifts in the fourth industrial revolution (4IR), and artificial intelligence (AI) is a critical enabling technology in digital engineering transformation. This article discusses the following research questions: What are the fundamental changes in the 4IR? More specifically, what are the fundamental changes in engineering? What is digital engineering? What are the main uncertainties there? What is trustworthy AI? Why is it important today? What are emerging engineering paradigm shifts in the 4IR? What is the relationship between the data-intensive paradigm and digital engineering transformation? What should we do for digitalization? From investigating the pattern of industrial revolutions, this article argues that ubiquitous machine intelligence (uMI) is the defining power brought by the 4IR. Digitalization is a condition to leverage ubiquitous machine intelligence. Digital engineering transformation towards Industry 4.0 has three essential building blocks: digitalization of engineering, leveraging ubiquitous machine intelligence, and building digital trust and security. The engineering design community at large is facing an excellent opportunity to bring the new capabilities of ubiquitous machine intelligence and trustworthy AI principles, as well as digital trust, together in various engineering systems design to ensure the trustworthiness of systems in Industry 4.0.
Transformers are becoming increasingly popular due to their superior performance over conventional convolutional neural networks (CNNs). However, transformers usually require much more memory to train than CNNs, which prevents their application in many low-resource settings. Local learning, which divides the network into several distinct modules and trains them individually, is a promising alternative to end-to-end (E2E) training that reduces the memory needed for training and increases parallelism. This paper is the first to apply local learning to transformers for this purpose. The standard CNN-based local learning method, InfoPro [32], reconstructs the input images for each module in a CNN. However, reconstructing the entire image does not generalize well. In this paper, we propose a new mechanism for each local module: instead of reconstructing the entire image, we reconstruct the module's input features, which are generated by the previous modules. We evaluate our approach on 4 commonly used datasets and 3 commonly used decoder structures on Swin-Tiny. The experiments show that our approach outperforms InfoPro-Transformer, the InfoPro variant with a Transformer backbone that we introduce, by up to 0.58% on the CIFAR-10, CIFAR-100, STL-10 and SVHN datasets, while using up to 12% less memory. Compared to the E2E approach, we require 36% less GPU memory when the network is divided into 2 modules and 45% less GPU memory when the network is divided into 4 modules.
Autonomous driving is an exciting new industry, posing important research questions. Within the perception module, 3D human pose estimation is an emerging technology that can enable the autonomous vehicle to perceive and understand the subtle and complex behaviors of pedestrians. While hardware systems and sensors have dramatically improved over the decades, with cars potentially boasting complex LiDAR and vision systems and with a growing body of dedicated datasets for this newly available information, not much work has been done to harness these novel signals for the core problem of 3D human pose estimation. Our method, which we call HUM3DIL (HUMan 3D from Images and LiDAR), efficiently makes use of these complementary signals in a semi-supervised fashion and outperforms existing methods by a large margin. It is a fast and compact model for onboard deployment. Specifically, we embed LiDAR points into pixel-aligned multi-modal features, which we pass through a sequence of Transformer refinement stages. Quantitative experiments on the Waymo Open Dataset support these claims, where we achieve state-of-the-art results on the task of 3D pose estimation.
Traffic forecasting has attracted widespread attention recently. In reality, traffic data usually contains missing values due to sensor or communication errors. The spatio-temporal nature of traffic data makes such missing values harder to process, and classic techniques (e.g., data imputation) are limited for two reasons: 1) along the temporal axis, values can be missing randomly or consecutively; 2) along the spatial axis, values can be missing on a single sensor or on multiple sensors simultaneously. Recent models powered by Graph Neural Networks have achieved satisfactory performance on traffic forecasting tasks. However, few of them are applicable to such a complex missing-value context. To this end, we propose GCN-M, a Graph Convolutional Network model that can handle complex missing values in the spatio-temporal context. In particular, we jointly model the missing-value processing and traffic forecasting tasks, considering both local spatio-temporal features and global historical patterns in an attention-based memory network. We also propose a dynamic graph learning module based on the learned local-global features. The experimental results on real-life datasets show the reliability of our proposed method.
Parkinson's Disease (PD) is a progressive nervous system disorder that has affected more than 5.8 million people, especially the elderly. Due to the complexity of its symptoms and its similarity to other neurological disorders, early detection requires the involvement of neurologists or PD specialists, who are not accessible to most older adults. Therefore, we integrate smart mobile devices with AI technologies. In this paper, we introduce the framework of the PD early detection system we have developed, which combines different tasks that evaluate both motor and non-motor symptoms. With the developed model, we help users detect PD promptly in non-clinical settings and identify their most severe symptoms. The results are expected to be further used for PD rehabilitation guidance and for the detection of other neurological disorders.